K-Means for noise-insensitive multi-dimensional feature learning
Authors
Abstract
Many measurement modalities that image an object by probing it pixel-by-pixel, such as Photoacoustic Microscopy, produce a multi-dimensional feature (typically a time-domain signal) at each pixel. In principle, the many degrees of freedom in such a signal admit the possibility that significant multi-modal information about the underlying targets is implicitly present, far more than a single scalar "brightness". However, the measured signal is neither a weighted sum of basis functions (such as principal components) nor one of a set of prototypes (as in K-means), which has motivated the novel clustering method proposed here. Signals are clustered based on their shape, but not their amplitude, using angular distance, and centroids are computed as the direction of maximal intra-cluster variance, resulting in an algorithm capable of learning centroids (signal shapes) that relate to underlying, albeit unknown, target characteristics in a scalable and noise-robust manner.
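The abstract names the two key ingredients of the method: assignment by angular distance, so only signal shape (not amplitude) matters, and a centroid update taken as the direction of maximal intra-cluster variance. The following Python sketch illustrates such a scheme under stated assumptions (unit-normalized signals, sign-invariant cosine assignment, leading-singular-vector centroid update); the function name shape_kmeans and these specific choices are illustrative, not the authors' reference implementation.

import numpy as np

def shape_kmeans(X, k, n_iter=50, seed=0):
    # X: (n_signals, n_samples) array of per-pixel time-domain signals.
    # Normalize each signal to unit norm so that only its shape, not its
    # amplitude, influences the clustering.
    rng = np.random.default_rng(seed)
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    centroids = Xn[rng.choice(len(Xn), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign by angular (cosine) similarity; the absolute value treats
        # sign-flipped shapes as equivalent (an illustrative choice).
        labels = np.abs(Xn @ centroids.T).argmax(axis=1)
        for j in range(k):
            members = Xn[labels == j]
            if len(members) == 0:
                continue
            # Centroid update: direction of maximal intra-cluster variance,
            # i.e. the leading right singular vector of the member matrix.
            _, _, vt = np.linalg.svd(members, full_matrices=False)
            centroids[j] = vt[0]
    return labels, centroids

Applied to a (num_pixels, num_time_samples) stack of signals, the returned centroids can be inspected as learned signal shapes and the labels rendered as a per-pixel segmentation map.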
Similar Resources
Learning Feature Representations with K-Means
Many algorithms are available to learn deep hierarchies of features from unlabeled data, especially images. In many cases, these algorithms involve multi-layered networks of features (e.g., neural networks) that are sometimes tricky to train and tune and are difficult to scale up to many machines effectively. Recently, it has been found that K-means clustering can be used as a fast alternative ...
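That line of work typically learns a dictionary of centroids from (whitened) image patches and then encodes new patches against the dictionary, for example with a soft "triangle" activation. The following Python sketch assumes scikit-learn's MiniBatchKMeans; the encoding function and its name encode_triangle are illustrative rather than a quotation of the paper's recipe.

import numpy as np
from sklearn.cluster import MiniBatchKMeans

def learn_dictionary(patches, n_centroids=256):
    # patches: (n_patches, patch_dim) array, e.g. flattened, whitened image patches.
    km = MiniBatchKMeans(n_clusters=n_centroids, random_state=0).fit(patches)
    return km.cluster_centers_

def encode_triangle(patches, centroids):
    # "Triangle" encoding: a patch activates a centroid by how much closer it is
    # to that centroid than the mean centroid distance, clipped at zero.
    dists = np.linalg.norm(patches[:, None, :] - centroids[None, :, :], axis=2)
    return np.maximum(0.0, dists.mean(axis=1, keepdims=True) - dists)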
K-means-based Feature Learning for Protein Sequence Classification
Protein sequence classification has been a major challenge in bioinformatics and related fields for some time and remains so today. Due to the complexity and volume of protein data, algorithmic techniques such as sequence alignment are often unsuitable due to time and memory constraints. Heuristic methods based on machine learning are the dominant technique for classifying large sets of protein...
Learning the k in k-means
When clustering a dataset, the right number k of clusters to use is often not obvious, and choosing k automatically is a hard algorithmic problem. In this paper we present an improved algorithm for learning k while clustering. The G-means algorithm is based on a statistical test for the hypothesis that a subset of data follows a Gaussian distribution. G-means runs k-means with increasing k in a...
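The core of that procedure is a per-cluster Gaussianity check: tentatively split a cluster in two, project its points onto the axis joining the two child centroids, and apply an Anderson-Darling test to the one-dimensional projection. A rough Python sketch follows, assuming scipy's anderson and scikit-learn's KMeans; the helper name gmeans_should_split, the minimum-size guard, and the critical-value index are illustrative choices, not the paper's code.

import numpy as np
from scipy.stats import anderson
from sklearn.cluster import KMeans

def gmeans_should_split(cluster_points, critical_index=2):
    # Too few points to test reliably: keep the cluster as-is.
    if len(cluster_points) < 8:
        return False
    # Tentative 2-way split of the cluster.
    children = KMeans(n_clusters=2, n_init=10).fit(cluster_points)
    v = children.cluster_centers_[0] - children.cluster_centers_[1]
    # Project onto the axis joining the two child centroids.
    proj = cluster_points @ v / (np.linalg.norm(v) ** 2 + 1e-12)
    # Anderson-Darling test for normality of the 1-D projection; exceeding the
    # chosen critical value rejects Gaussianity, so the cluster should split.
    result = anderson(proj, dist="norm")
    return result.statistic > result.critical_values[critical_index]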
K-means Clustering with Feature Hashing
One of the major problems of K-means is that one must use dense vectors for its centroids, and therefore it is infeasible to store such huge vectors in memory when the feature space is high-dimensional. We address this issue by using feature hashing (Weinberger et al., 2009), a dimension-reduction technique, which can reduce the size of dense vectors while retaining sparsity of sparse vectors. ...
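The hashing trick maps an arbitrarily large sparse feature space into a fixed number of buckets, so the dense centroid vectors K-means must keep in memory stay small. A minimal sketch using scikit-learn's FeatureHasher, one standard implementation of the idea (not necessarily the exact setup evaluated in that paper):

from sklearn.cluster import KMeans
from sklearn.feature_extraction import FeatureHasher

# Hash sparse token-count dicts into a fixed 4096-dimensional space so that the
# K-means centroids stay bounded regardless of vocabulary size.
hasher = FeatureHasher(n_features=2**12, input_type="dict")
docs = [{"apple": 2, "banana": 1}, {"carrot": 3}, {"apple": 1, "carrot": 1}]
X = hasher.transform(docs)          # sparse matrix of shape (3, 4096)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)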
Hierarchical k-Means for Unsupervised Learning
In this paper we investigate how to accelerate k-means based unsupervised learning algorithms with hierarchical k-means. We show that hierarchical k-means significantly speeds up k-means based learning approaches in both the training and query phases at minimal cost to test accuracy. This speedup allows for much larger numbers of centroids to be used, which in turn leads to much better learning...
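One common way to realize this speedup is a centroid tree: recursively run K-means with a small branching factor, then answer nearest-centroid queries by greedy descent, comparing against only a handful of centroids per level. The sketch below follows that idea under stated assumptions; the dictionary-based tree and the names build_hkmeans and query are illustrative, not the paper's implementation.

import numpy as np
from sklearn.cluster import KMeans

def build_hkmeans(X, branching=8, min_leaf=32):
    # Recursively split the data with small-k K-means to form a centroid tree.
    if len(X) <= min_leaf:
        return {"centroid": X.mean(axis=0), "children": None}
    km = KMeans(n_clusters=branching, n_init=4).fit(X)
    children = [build_hkmeans(X[km.labels_ == j], branching, min_leaf)
                for j in range(branching) if np.any(km.labels_ == j)]
    return {"centroid": X.mean(axis=0), "children": children}

def query(tree, x):
    # Greedy descent: follow the nearest child centroid at each level, so each
    # query touches O(branching * depth) centroids instead of all leaves.
    node = tree
    while node["children"]:
        node = min(node["children"],
                   key=lambda c: np.linalg.norm(x - c["centroid"]))
    return node["centroid"]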
Journal
Journal Title: Pattern Recognition Letters
Year: 2023
ISSN: 1872-7344, 0167-8655
DOI: https://doi.org/10.1016/j.patrec.2023.04.009